Exploiting Multi-Features to Detect Hedges and their Scope in Biomedical Texts

Authors

  • Huiwei Zhou
  • Xiaoyan Li
  • Degen Huang
  • Zezhong Li
  • Yuansheng Yang
Abstract

In this paper, we present a machine learning approach that detects hedge cues and their scope in biomedical texts. Identifying hedged information is a kind of semantic filtering, and it is important because it separates speculative information from factual information. To address this semantic analysis problem, various evidential features are proposed and integrated through a Conditional Random Fields (CRFs) model. Hedge cues that appear in the training dataset are regarded as keywords and employed as an important feature in the hedge cue identification system. For scope finding, we construct a CRF-based system and a syntactic pattern-based system and compare their performance. Experiments using test data from the CoNLL-2010 shared task show that our proposed method is robust. In in-domain evaluations, the F-scores for the biological hedge detection task and the scope finding task reach 86.32% and 54.18%, respectively.
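The keyword feature mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual feature set, so everything here is an assumption for illustration: the toy training sentences, the `CUE`/`O` labels, and the hypothetical helpers `build_cue_lexicon` and `token_features`. The sketch only shows how cues seen in training data could become a binary per-token feature of the kind a CRF toolkit would consume; it does not train a CRF.

```python
from typing import Dict, List, Set, Tuple

# Toy training data (hypothetical): each sentence is a list of (token, label)
# pairs, where "CUE" marks a hedge cue and "O" marks any other token.
TRAIN: List[List[Tuple[str, str]]] = [
    [("These", "O"), ("results", "O"), ("suggest", "CUE"), ("a", "O"), ("role", "O")],
    [("This", "O"), ("may", "CUE"), ("indicate", "O"), ("binding", "O")],
]


def build_cue_lexicon(sentences: List[List[Tuple[str, str]]]) -> Set[str]:
    """Collect every lowercased token labelled as a hedge cue in the training data."""
    return {tok.lower() for sent in sentences for tok, label in sent if label == "CUE"}


def token_features(tokens: List[str], i: int, lexicon: Set[str]) -> Dict[str, object]:
    """Build a feature dict for token i; 'is_keyword' is the lexicon lookup."""
    word = tokens[i].lower()
    return {
        "word": word,
        "is_keyword": word in lexicon,  # assumed form of the keyword feature
        "prev_word": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def sentence_features(tokens: List[str], lexicon: Set[str]) -> List[Dict[str, object]]:
    """Feature dicts for a whole sentence, one per token."""
    return [token_features(tokens, i, lexicon) for i in range(len(tokens))]


lexicon = build_cue_lexicon(TRAIN)
feats = sentence_features(["Data", "suggest", "that", "it", "may", "bind"], lexicon)
```

Per-token feature dicts of this shape are a common input format for CRF toolkits; the real system presumably combines the keyword feature with further lexical and syntactic evidence that the abstract does not enumerate.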


Similar resources

Exploiting Rich Features for Detecting Hedges and their Scope

This paper describes our system for detecting hedges and their scope in natural language texts, built for our participation in the CoNLL-2010 shared task. We formalize these two tasks as sequence labeling problems and implement them using a conditional random fields (CRFs) model. In the first task, we use a greedy forward procedure to select features for the classifier. These features include part-ofspee...


Metadiscourse Use in Popular and Professional Science: The Case of Hedges and Boosters

The present article shows that all scientific texts included in journals, magazines, and newspapers are vulnerable to the penetration of hedges and boosters. However, it was found that scientific texts in the three corpora tended to open up the possibilities of alternative voices rather than narrowing them down. The relatively higher frequency of occurrence of hedges in comparison with booster...


Learning to Detect Hedges and their Scope Using CRF

Detecting speculative assertions is essential for distinguishing facts from uncertain information in biomedical text. This paper describes a system that detects hedge cues and their scope using a CRF model. An HCDic feature is presented to improve the system's performance in detecting hedge cues on the BioScope corpus. The feature can make use of cross-domain resources.


The CoNLL-2010 Shared Task: Learning to Detect Hedges and their Scope in Natural Language Text

The CoNLL 2010 Shared Task was dedicated to the detection of uncertainty cues and their linguistic scope in natural language texts. The motivation behind this task was that distinguishing factual and uncertain information in texts is of essential importance in information extraction. This paper provides a general overview of the shared task, including the annotation protocols of the training an...


Descriptive Analysis of Negation Cues in Biomedical Texts

In this paper we present a description of negation cues and their scope in biomedical texts, based on the cues that occur in the BioScope corpus. We provide information relative to the ambiguity of the negation cue and to the type of scope, as well as examples. We show that the scope depends mostly on the part-of-speech of the cue and on the syntactic features of the clause. Although several st...




Publication date: 2010